ABSTRACT

The relatively large number of students who perform 
poorly in freshman chemistry courses signals the need for the 
identification of criteria that will result in correct placement 
decisions for incoming college students. Research findings have 
generally reported placement criteria that correlate significantly 
with performance in college chemistry coursework; however,
predictions of course grades have tended to be very inaccurate 
because most of this research focused on the development of 
regression models in which a single predictor was utilized. The 
study reported in this paper used the method of discriminant
analysis to predict membership of the target sample of freshman 
students into one of two groups: those that received a grade of
"A," "B," or "C" in their first semester of chemistry, and those
that received a grade of "D," "F," or "W." An acceptable
statistical model, in terms of assumptions on normality and 
homogeneity of variance, was developed from discriminant functional 
analyses and multiple regression analyses of data from previous 
freshman classes. The factors that were identified as best 
predictors of performance in college chemistry, in order of
standardized relative weightings, were the mathematics score on
the student's college entrance examination, high school grade
point average (GPA), course grade in high school chemistry, high
school mathematics GPA, and course grade in high school English.
The model correctly predicted the discriminant group for almost
three out of four students (73.7%). (18 references) (JJK)
Introduction

The relatively large number of students who perform poorly 
in freshman chemistry is a persistent problem for college 
chemistry teachers. Previous authors (1-16) have reported 
placement criteria which correlate significantly with 
performance in chemistry. These researchers have developed 
statistical models designed to predict grades in college 
chemistry using as predictors high school records, standardized
test scores, and results of placement examinations.

The most commonly reported predictors of performance are
Scholastic Aptitude Test (SAT) scores, American College Testing
Program (ACT) scores, high school grades in chemistry and
mathematics, high school rank or grade point average, and
placement examinations developed at the authors' institutions.
Of these variables, the SAT mathematics test score has been
found to have one of the highest correlations with grades in
college chemistry (1-3). It has been observed that grades of
"C" or less in high school chemistry tend to predict low grades
in college chemistry, but that a higher grade alone may or may
not predict success (4).

Most of the studies have reported the development of simple
regression models in which a single predictor (e.g., SAT
mathematics score, high school chemistry grade, or score on a
placement test) was correlated with grades in college chemistry.
Although moderate correlation coefficients (0.40) have been
obtained, predictions of grades themselves have tended not to
be very accurate. For example, an SAT mathematics score of 700 or
higher may predict an "A" or "B" in chemistry, and a score below
500 a "D" or "F," but scores between 500 and 700 may predict
nothing. A student with an SAT score in the 500-700 range may
earn any grade from "A" to "F" (5).

Based on the author's doctoral dissertation of the same

Multiple regression equations include more than one
predictor variable. Multiple correlation coefficients tend
to be higher than simple correlation coefficients. Still,
predicting an "A" or an "F" (i.e., grades at the extreme ends of
the range) appears to be more successful than predicting a "C."
In this study a different, and possibly more powerful, approach
was taken. Rather than predicting performance along a
continuum of "A" to "F," discriminant function analysis was
employed to assign membership in one of two discrete groups:
students who would be predicted to earn a "C" or better in
chemistry ("successful" students), and those predicted to earn
less than a "C" ("unsuccessful" students).

The Problem 

Before fall, 1985, all engineering students at the
author's institution had to complete a 2-semester sequence in
general chemistry. Since fall, 1985, engineering students have
been required to complete a 5-credit course, PS 110, Chemistry
for Engineers, generally in their first semester. This course
assumes as prerequisite knowledge much of the material found
in the first one-third of general chemistry textbooks like
those authored by Brown and LeMay, Masterton and Slowinski, and
Brady and Humiston. PS 110 is an accelerated course covering
states of matter, solutions, kinetics, thermodynamics,
equilibrium, electrochemistry, and introductory organic
chemistry. To take PS 110, students must have taken high school
chemistry and passed a placement test, or first completed a
preparatory chemistry course, or transferred in chemistry credit
from another college or university.

Virtually all incoming engineering students have taken high 
school chemistry, but high school courses vary widely in level 
of presentation and topics covered. Therefore, neither the 
fact of having taken high school chemistry nor the grade earned 
suffices for a placement decision. The problem addressed in this
study is that too many students perform poorly in PS 110 because
they lack sufficient background for an accelerated chemistry
course. Criteria are therefore needed that will result in
correct placement decisions.

The Purpose 

The purpose of this study was to contribute to solving the
problem of students being unsuccessful in freshman chemistry by
identifying factors which may be used to predict performance.
The plan was to collect existing data on grades earned in high
school, scores earned on aptitude and performance tests, and
demographic data obtained on students prior to matriculation. A
discriminant function was developed that was intended to assign
membership in one of two groups: (1) those who received an "A,"
"B," or "C" in their first semester of chemistry; (2) those
who received a "D," "F," or "W." Only the grade earned in a
student's first attempt was used; grades received in subsequent
attempts were excluded. It is believed that this is the first
time the methods of discriminant analysis have been applied to
the problem of predicting performance in freshman chemistry.

Sample Groups 

Data were collected on 980 engineering students who had 
taken freshman chemistry between fall, 1980, and spring, 1989. 
Most of the students were white males, 18-20 years old, whose 
home towns were outside Arizona. Females, older students, and 
ethnic minorities (including foreign students) comprised about 
15 to 20% of the population. 

Variables 

Independent variables measuring aptitude or achievement
included the following: high school grades in chemistry,
mathematics, physics, and English; high school grade point
average (GPA); high school class rank; ACT and SAT scores; and
scores on the Nelson-Denny Reading Test. Demographic variables
included the following: age, gender, ethnicity, state or
country in which the student's high school was located, size of
city in which the high school was located, type of high school
(public or private), years since high school graduation, number
of extracurricular activities in which the student participated
in high school, and number of transfer credits from other
post-secondary institutions.

Other variables were used as well. The GPA was calculated
for high school mathematics courses, and individual grades in
high school courses were coded as binary variables. As an
example of the latter, the variable "'A' in Chemistry" was
coded with a "1" if an "A" had been earned, and with a "0" if
less than an "A" had been earned or if chemistry was not taken
in high school.
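
The binary coding rule just described can be illustrated with a short Python sketch. The function name and the sample grades are hypothetical and are not taken from the study's data; a course that was not taken is represented here by None, which is an assumption about how missing grades might be stored.

```python
def code_grade_indicator(grade, target):
    """Return 1 if the recorded letter grade equals the target letter, else 0.

    A course that was not taken is passed as None and coded 0, following
    the rule stated in the text for the "'A' in Chemistry" variable.
    """
    if grade is None:
        return 0
    return 1 if grade.strip().upper() == target else 0


# Hypothetical high school chemistry grades (None = chemistry not taken).
hs_chem_grades = ["A", "B", None, "a", "C"]

# Indicator variable "'A' in Chemistry" for each hypothetical student.
a_in_chem = [code_grade_indicator(g, "A") for g in hs_chem_grades]
print(a_in_chem)
```

The same helper could produce the other indicator variables listed in Table 1, such as "'C' in HS Chem," by changing the target letter.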

Methodology 

All computations were performed on the Honeywell mainframe
computer at Northern Arizona University, Flagstaff, Arizona,
using the Statistical Package for the Social Sciences (SPSS-X;
17). Initially, descriptive statistics were obtained on all
variables to check for univariate normality and for outliers.
Chi-square tests were run to test the possibility that
differences in students' performances might have been due to
demographic characteristics rather than to differences in
scholastic aptitude or ability. Pearson correlation
coefficients (r) were computed for all pairwise combinations
of the independent variables among themselves and with the
criterion, performance in freshman chemistry. Only those
variables whose correlation coefficients with performance were
greater than 0.30 were found later to be important in the
analyses. One-way analysis of variance (ANOVA) tests were
performed for English, mathematics, and composite SAT/ACT test
score means for groups of students differentiated by their
letter grades in chemistry. Multiple regression analyses were
run on various combinations of predictor variables in an attempt
to identify statistically significant variables and to validate
the assumption of multivariate normality.

Discriminant function analysis of data from past classes
was used to develop the statistical model, which could then be
used as a placement tool for future students. Discriminant
analysis is a multivariate statistical procedure in which linear
combinations of variables are used to distinguish between two or
more mutually exclusive categories of cases. The variables
"discriminate" between groups of cases and predict into which
category or group a case falls based upon the values of these
variables (18). In this study, there were two groups: students
who had earned a "C" or better and students who had earned less
than a "C." For the analysis to be valid, three underlying
assumptions had to hold: (1) variables had to represent samples
drawn from a multivariate normal population; (2) sample
variance/covariance matrices had to be equal; (3) there could
not be any cases of multicollinearity or singularity in the
data. SPSS-X includes tests which showed that the first two
assumptions were satisfied. By carefully choosing combinations
of variables that themselves were not too highly correlated or
linear combinations of each other, the third assumption was
satisfied.
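
As a rough illustration of how a two-group discriminant function works, the following Python sketch fits Fisher's linear discriminant to two predictors. It is not the SPSS procedure used in the study: the data points are made up, only two predictors are used so that the pooled covariance matrix can be inverted by hand, and all function names are hypothetical.

```python
def mean_vector(rows):
    """Mean of each of the two predictors over a group of cases."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2x2 covariance (assumes equal group covariances)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def fit_two_group_discriminant(group_low, group_high):
    """Return (weights, cutoff) for a two-predictor linear discriminant."""
    (a, b), (c, d) = pooled_covariance(group_low, group_high)
    det = a * d - b * c
    if abs(det) < 1e-12:
        # Corresponds to the singularity / multicollinearity assumption.
        raise ValueError("predictors are collinear; covariance is singular")
    m_low, m_high = mean_vector(group_low), mean_vector(group_high)
    diff = [m_high[0] - m_low[0], m_high[1] - m_low[1]]
    # weights = inverse(pooled covariance) * (difference of group means)
    w = [(d * diff[0] - b * diff[1]) / det,
         (-c * diff[0] + a * diff[1]) / det]

    def score(x):
        return w[0] * x[0] + w[1] * x[1]

    # Cutoff is the midpoint between the two group centroids.
    cutoff = (score(m_low) + score(m_high)) / 2
    return w, cutoff


# Made-up (HS GPA, math test score / 100) pairs, for illustration only.
unsuccessful = [(2.5, 4.8), (2.9, 5.2), (3.0, 5.0), (2.7, 5.6)]
successful = [(3.4, 6.1), (3.2, 5.8), (3.7, 6.4), (3.5, 5.9)]

weights, cutoff = fit_two_group_discriminant(unsuccessful, successful)
new_student = (3.6, 6.2)
score = weights[0] * new_student[0] + weights[1] * new_student[1]
print("successful" if score > cutoff else "unsuccessful")
```

The sketch scores cases without a constant term and compares them with a nonzero cutoff; statistical packages usually fold a constant into the function instead, which yields the same classification.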

It was felt that the best discriminant model would be one
which predicted correctly the greatest number of cases and which
lent itself most easily to interpretation. Various subsets of
variables and of subjects (students) were tried in order to find
the optimum combination. The main precaution was to avoid
combinations of variables which gave redundant information
either by being too highly correlated or by being linear
combinations of one another.

Findings 

All variables were found to have univariate normal
distributions and, collectively, to follow a multivariate normal
distribution.

The chi-square tests showed that, with respect to gender,
ethnicity, type of high school, state or country, or number of
extracurricular activities, the null hypothesis of no
relationship was retained, i.e., no relationship existed at the
5% level of significance (p = 0.05) which might have threatened
the internal validity of the study. With respect to size of
city, the null hypothesis was rejected at the 5% level.
Students from rural high schools appeared to outperform students
from suburban or urban schools, although the difference was
still too small for size of city to be significant in the
discriminant function finally obtained.

The Pearson r correlations between grade in college 
chemistry and several of the predictor variables were in the 
range 0.30 or higher. These correlations are shown in Table 
1. All other predictor variables that were tried had values 
of r less than 0.30. 



Approximately 260 students had taken both the SAT and the
ACT. Most of the remaining students (with the exception of
those from foreign countries) had taken one of the two. For
students who had taken both tests, correlations between
corresponding scores were found to be quite high: r = 0.623 on
verbal tests, r = 0.721 on mathematics tests, and r = 0.774 on
composite scores. Regression equations were developed that
would convert one set of scores to the other, so that a single,
"generalized" test score could be used in the statistical
analyses. For example, if a student took the SAT, his or her
SAT mathematics score was used. If, however, he or she took the
ACT, but not the SAT, then the ACT mathematics score was
converted to an equivalent SAT score using the following
regression formula:

SAT Math = 14.385 * ACT Math + 200.
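
The conversion rule can be written as a small Python sketch. The coefficients come from the regression formula given in the text; the function names and the use of None for a test not taken are assumptions made for illustration.

```python
def act_math_to_sat_math(act_math):
    """Convert an ACT mathematics score to an equivalent SAT mathematics score."""
    return 14.385 * act_math + 200.0


def generalized_math_score(sat_math=None, act_math=None):
    """Return the "generalized" mathematics score for one student.

    The SAT score is used when available; otherwise the ACT score is
    converted. None is returned when neither test was taken.
    """
    if sat_math is not None:
        return float(sat_math)
    if act_math is not None:
        return act_math_to_sat_math(act_math)
    return None


# A hypothetical student with an ACT mathematics score of 20 and no SAT.
print(generalized_math_score(act_math=20))
```
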

The ANOVA tests demonstrated that the assumptions of
normality and homogeneity of variance were satisfied. For the
numbers of degrees of freedom, each F-ratio was found to be
significant at the 0.01 level of significance, i.e., the higher
the letter grade earned in freshman chemistry, the higher the
mean standardized test score for the group which earned that
grade. As an example, the summary table for means of
mathematics scores is given in Table 2. Although the analysis
suggests a significant difference in group means, it should be
noted that the range of test scores is so large for each letter
grade that an SAT mathematics score alone would not be a very
reliable predictor of performance.
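
The F-ratio reported in Table 2 can be checked from the sums of squares and degrees of freedom in the same table. The Python sketch below is only an arithmetic check of the published figures, with variable names chosen for illustration.

```python
# Sums of squares and degrees of freedom as reported in Table 2.
ss_between, df_between = 702_844, 4
ss_within, df_within = 3_485_615, 663

# Mean squares are sums of squares divided by their degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F-ratio compares between-group to within-group variability.
f_ratio = ms_between / ms_within
print(round(ms_between), round(ms_within), round(f_ratio, 2))
```

The between-group and within-group sums of squares also add up to the total sum of squares listed in the table (4,188,459).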

The multiple regression analyses were run using the
stepwise method of entering variables into the equation. The
overall high school GPA and the SAT/ACT mathematics test score
were found to be the most important predictors of grade in
college chemistry. The grade earned in high school chemistry was
of lesser importance. No other variables contributed to the
regression model. The multiple R was found to equal 0.428. The
regression equation that was developed was the following, where
all variables are expressed in raw-score form:

Grade = 0.5176 * HS GPA + 0.004516 * Math Test
+ 0.2299 * HS Chem Grade - 3.165.

Using this model, the percentage of correct predictions of grade
in college chemistry was calculated. The model tends to
underpredict performance, to the extent that a grade of "A" was
never predicted. The number of correct predictions was only
about 30-35%.
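
A short Python sketch of the regression equation helps show why an "A" is never predicted. It assumes grades on a 4.0 scale (as in Table 3), an SAT mathematics maximum of 800, and that a predicted value would need to reach 3.5 to round to an "A"; the rounding rule is an assumption, since the text does not say how predicted values were mapped to letter grades.

```python
def predicted_chem_grade(hs_gpa, math_test, hs_chem_grade):
    """Predicted college chemistry grade from the raw-score regression equation."""
    return (0.5176 * hs_gpa
            + 0.004516 * math_test
            + 0.2299 * hs_chem_grade
            - 3.165)


# Best possible record under the stated assumptions:
# 4.0 high school GPA, 800 on the mathematics test, "A" (4.0) in chemistry.
best_case = predicted_chem_grade(4.0, 800, 4.0)
print(round(best_case, 2))
```

Under these assumptions even a perfect record yields a predicted grade of about 3.44, which falls short of 3.5.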

The results of the preceding statistical analyses were used
to select appropriate combinations of predictor variables and
subgroups of students for the discriminant analysis. A summary
of the discriminant function model felt to best represent the
data is given in Table 3. The factors that were identified as
best predicting performance in chemistry, in order of relative
beta weights (standardized coefficients), were found to be the
SAT/ACT mathematics test score, high school GPA, grade in high
school chemistry, high school mathematics GPA, and grade in high
school English. There were 395 students for whom data were
complete on all these variables. The canonical correlation
coefficient was 0.471, and 73.7% of the cases were correctly
classified.

Classification is accomplished by comparing a student's
calculated value of the function "D" to the average of the group
centroids (-0.227). If D < -0.227, the student is predicted to
earn a D, F or W in chemistry and is placed into the preparatory
course. If D > -0.227, the student is predicted to earn an A, B
or C and is placed into PS 110.
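
The placement rule can be sketched in Python using the unstandardized function and group centroids printed in Table 3. The printed coefficients are rounded, so scores computed at the group means differ slightly from the published centroids. The function names are hypothetical, and assigning a score exactly equal to the cutoff to the preparatory course is an assumption, since the text does not cover that case.

```python
# Group centroids as reported in Table 3.
CENTROID_DFW = -0.80542
CENTROID_ABC = 0.35146

# The cutoff is the average of the two centroids (about -0.227).
CUTOFF = (CENTROID_DFW + CENTROID_ABC) / 2


def discriminant_score(math_test, hs_gpa, hs_chem, math_gpa, hs_english):
    """Value of D from the unstandardized function printed in Table 3."""
    return (0.00650 * math_test
            + 0.58936 * hs_gpa
            + 0.29141 * hs_chem
            + 0.36168 * math_gpa
            + 0.25368 * hs_english
            - 8.422316)


def place(score):
    """Return the course placement implied by a discriminant score."""
    return "PS 110" if score > CUTOFF else "preparatory course"


# Scores evaluated at the Table 3 group means for each group.
d_low = discriminant_score(540, 2.90, 2.53, 2.65, 2.62)
d_high = discriminant_score(596, 3.35, 3.18, 3.20, 3.16)
print(place(d_low), "/", place(d_high))
```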

The records of students who were not classified correctly
were examined. Among students whose performance was
lower than had been predicted, it was found that almost all of
them had received "D's," with only a few "W's" and almost no
"F's." This pattern suggests that they really did have the
ability to be successful if they had only worked a little
harder. Among students whose performance was higher than had
been predicted, almost all the grades were "C's," suggesting
that these students managed to put forth the additional effort
needed to be successful. Few of these students earned higher
than a "C." Those students who did receive a higher grade
tended to have been older than traditional freshmen and to have
transferred credit from other institutions.
Conclusion

The discriminant function model that has been developed 
fairly successfully explains the performances of past students 
in freshman chemistry. The usefulness of the model, however, 
lies in its ability to assist in placement decisions for future 
students. As it turned out, this study was very timely. As the 
study was being concluded, it was decided at the author's 
institution not to conduct placement testing anymore for 
incoming engineering students, but to base placement decisions 
on high school records. Therefore, as of this writing (spring,
1990), the author is testing the discriminant model, having used
it to place approximately 300 incoming students into either
engineering chemistry or preparatory chemistry. After
one to two years of using the discriminant model as a placement 
tool, it is expected that a follow-up study will be conducted to 
evaluate the model's effectiveness. 



Table 1

Pearson Correlation Coefficients
Between Grade in Chemistry
and Predictor Variables

                                      Number of
Variable              Coefficient     Students
SAT Math                 0.425           436
SAT Composite            0.346           436
ACT English              0.298           271
ACT Math                 0.387           272
ACT Soc Sci              0.301           263
ACT Nat Sci              0.414           263
ACT Composite            0.416           264
HS Math GPA              0.406           514
HS Overall GPA           0.428           434
HS Class Rank            0.339           422
HS Chemistry             0.371           481
HS English               0.309           551
HS Advanced Math         0.383           249
HS Calculus              0.427           116
"A" in HS Chem           0.317           726
"C" in HS Chem          -0.303           726
"A" in HS English        0.320           726



Table 2

Summary Table of ANOVA Test of Means of
Mathematics Test Scores for Each
Letter Grade in Chemistry

Grade    Count   Group Mean   Std. Dev.      Min.           Max.
A          41      641.40       64.77        ?10            780
B         154      588.18     [illegible]  [illegible]   [illegible]
C         174      581.90       68.70        410            730
D         137      543.33       73.15        280            720
F, W      162      524.42       79.24        320            730
Total     668      565.15       75.24        280            780

                             Sum of       Mean        F
Source             D.F.      Squares      Squares     Ratio    Prob.
Between Groups       4         702,844    175,711     33.42    <0.01%
Within Groups      663       3,485,615      5,257
Total              667       4,188,459









Table 3 



Discriminant Function Model to Predict 
Performance in Chemistry 



Number of Cases by Group

Grade           Number of Cases
D, F, or W           120
A, B, or C           275
Total                395

Group Means (on an "A" = 4.0 scale)

              High School   Standardized       Chemistry   English   High School
Grade         GPA           Mathematics Test   Grade       Grade     Mathematics GPA
D, F, or W    2.90          540                2.53        2.62      2.65
A, B, or C    3.35          596                3.18        3.16      3.20
Total         3.22          579                2.98        2.99      3.03

Discriminant Function Coefficients

Variable                  Standardized    Unstandardized
Mathematics Test            0.41506        [illegible]
High School GPA             0.28855        0.6983639
High School Chemistry       0.24663        0.2914124
Mathematics GPA             0.23521        0.3616198
High School English         0.19851        0.2536181
(Constant)                                 -8.422316

The Discriminant Function in Unstandardized Form:

D = 0.00650 * Math Test + 0.58936 * HS GPA + 0.29141 * HS Chem
+ 0.36168 * Math GPA + 0.25368 * HS English - 8.422316

Multiple R = 0.411

Discriminant Function Evaluated at Group Means
(Group Centroids)

Grade           Value
D, F, or W      -0.80542
A, B, or C       0.35146

Classification Results

                 No. of    Predicted Group Membership
Actual Group     Cases     D, F, W         A, B, C
D, F, W           120      69 (17.5%)      51 (12.9%)
A, B, C           275      53 (13.4%)     222 (56.2%)

Percent of Cases Classified Correctly: 73.7%
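
The overall classification rate can be recomputed from the four cell counts in the classification results. The Python sketch below assumes the cell assignment implied by the row totals (69 + 51 = 120 and 53 + 222 = 275); the dictionary layout is chosen only for illustration.

```python
# (actual group, predicted group) -> number of students, from Table 3.
classification = {
    ("DFW", "DFW"): 69,
    ("DFW", "ABC"): 51,
    ("ABC", "DFW"): 53,
    ("ABC", "ABC"): 222,
}

total_cases = sum(classification.values())

# Correctly classified cases are those where actual and predicted agree.
correct = sum(n for (actual, predicted), n in classification.items()
              if actual == predicted)

percent_correct = 100 * correct / total_cases
print(total_cases, correct, round(percent_correct, 1))
```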


